/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/**
 * Benchmarking Lucene By Tasks
 *
 * <p>This package provides "task based" performance benchmarking of Lucene. One can use the
 * predefined benchmarks, or create new ones.
 *
 * <p>Contained packages:
 *
 * <table class="padding4" style="border: 1px solid">
 *  <caption>table of benchmark packages</caption>
 *  <tr>
 *    <td><b>Package</b></td>
 *    <td><b>Description</b></td>
 *  </tr>
 *  <tr>
 *    <td><a href="stats/package-summary.html">stats</a></td>
 *    <td>Statistics maintained when running benchmark tasks.</td>
 *  </tr>
 *  <tr>
 *    <td><a href="tasks/package-summary.html">tasks</a></td>
 *    <td>Benchmark tasks.</td>
 *  </tr>
 *  <tr>
 *    <td><a href="feeds/package-summary.html">feeds</a></td>
 *    <td>Sources for benchmark inputs: documents and queries.</td>
 *  </tr>
 *  <tr>
 *    <td><a href="utils/package-summary.html">utils</a></td>
 *    <td>Utilities used for the benchmark, and for the reports.</td>
 *  </tr>
 *  <tr>
 *    <td><a href="programmatic/package-summary.html">programmatic</a></td>
 *    <td>Sample performance test written programmatically.</td>
 *  </tr>
 * </table>
 *
 * <h2>Table Of Contents</h2>
 *
 * <ol>
 *   <li><a href="#concept">Benchmarking By Tasks</a>
 *   <li><a href="#usage">How to use</a>
 *   <li><a href="#algorithm">Benchmark "algorithm"</a>
 *   <li><a href="#tasks">Supported tasks/commands</a>
 *   <li><a href="#properties">Benchmark properties</a>
 *   <li><a href="#example">Example input algorithm and the result benchmark report.</a>
 *   <li><a href="#recsCounting">Results record counting clarified</a>
 * </ol>
 *
 * <a id="concept"></a>
 *
 * <h2>Benchmarking By Tasks</h2>
 *
 * <p>Benchmark Lucene using task primitives.
 *
 * <p>A benchmark is composed of some predefined tasks, allowing for creating an index, adding
 * documents, optimizing, searching, generating reports, and more. A benchmark run takes an
 * "algorithm" file that contains a description of the sequence of tasks making up the run, and some
 * properties defining a few additional characteristics of the benchmark run.
 *
 * <a id="usage"></a>
 *
 * <h2>How to use</h2>
 *
 * <p>The easiest way to run a benchmark is with Gradle:
 *
 * <ul>
 *   <li>./gradlew -p lucene/benchmark getReuters run <br>
 *       - would run the <code>micro-standard.alg</code> "algorithm".
 *   <li>./gradlew -p lucene/benchmark getReuters run -PtaskAlg=conf/compound-penalty.alg <br>
 *       - would run the <code>compound-penalty.alg</code> "algorithm".
 *   <li>./gradlew -p lucene/benchmark getReuters run -PtaskAlg=[full-path-to-your-alg-file] <br>
 *       - would run <code>your perf test</code> "algorithm".
 *   <li>java org.apache.lucene.benchmark.byTask.programmatic.Sample <br>
 *       - would run a performance test programmatically - without using an alg file. This is less
 *       readable, and less convenient, but possible.
 * </ul>
 *
 * <p>You may find existing tasks sufficient for defining the benchmark <i>you</i> need; otherwise,
 * you can extend the framework to meet your needs, as explained herein.
 *
 * <p>Each benchmark run has a DocMaker and a QueryMaker. These two should usually match, so that
 * "meaningful" queries are used for a certain collection. Properties set at the header of the alg
 * file define which "makers" should be used. You can also specify your own makers, extending
 * DocMaker and implementing QueryMaker.
 *
 * <blockquote>
 *
 * <b>Note:</b> since 2.9, DocMaker is a concrete class which accepts a ContentSource. In most
 * cases, you can use the DocMaker class to create Documents, while providing your own ContentSource
 * implementation. For example, the current Benchmark package includes ContentSource implementations
 * for TREC, Enwiki and Reuters collections, as well as others like LineDocSource which reads a
 * 'line' file produced by WriteLineDocTask.
 *
 * </blockquote>
 *
 * <p>The benchmark .alg file contains the benchmark "algorithm". The syntax is described below.
 * Within the algorithm, you can specify groups of commands, assign them names, specify commands
 * that should be repeated, run commands serially or in parallel, and also control the speed of
 * "firing" the commands.
 *
 * <p>This lets you specify, for instance, that an index should be opened for update, that
 * documents should be added to it one by one but no faster than 20 docs a minute, and that, in
 * parallel with this, some N queries should be searched against that index, at no more than 2
 * queries a second. You can have the searches all share an index reader, or have each of them
 * open its own reader and close it afterwards.
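 *
 * <p>A minimal sketch of such a scenario, using the syntax described below. The sequence names
 * and the repetition counts (100 and 50) are arbitrary illustrations, and the sketch assumes that
 * an index already exists in the work directory:
 *
 * <pre>
 * OpenIndex
 * [
 *     { "SlowAdds" AddDoc } : 100 : 20/min
 *     { "SlowSearches" Search } : 50 : 2/sec
 * ]
 * CloseIndex
 * </pre>
 *
 * <p>In this sketch each search opens and closes its own reader, since no reader was opened
 * beforehand.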
 *
 * <p>If the commands available for use in the algorithm do not meet your needs, you can add
 * commands by adding a new task under org.apache.lucene.benchmark.byTask.tasks - you should extend
 * the PerfTask abstract class. Make sure that your new task class name is suffixed by Task. Assume
 * you added the class "WonderfulTask" - doing so also enables the command "Wonderful" to be used in
 * the algorithm.
 *
 * <p><u>External classes</u>: It is sometimes useful to invoke the benchmark package with your
 * external alg file that configures the use of your own doc/query maker and/or HTML parser. You
 * can do this without modifying the benchmark package code, by passing your class path with the
 * benchmark.ext.classpath property:
 *
 * <ul>
 *   <li>./gradlew -p lucene/benchmark run -PtaskAlg=[full-path-to-your-alg-file] <span
 *       style="color: #FF0000">-Dbenchmark.ext.classpath=/mydir/classes </span> -Dtask.mem=512M
 * </ul>
 *
 * <p><u>External tasks</u>: When writing your own tasks under a package other than
 * <b>org.apache.lucene.benchmark.byTask.tasks</b>, specify that package through the <span
 * style="color: #FF0000">alt.tasks.packages</span> property.
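 *
 * <p>For illustration only, assuming the "WonderfulTask" class from above lives in a hypothetical
 * package named com.example.benchmark.tasks, the alg file could look roughly like this:
 *
 * <pre>
 * # header: where to look for tasks not found in the default package
 * alt.tasks.packages=com.example.benchmark.tasks
 *
 * # body: WonderfulTask is invoked as the "Wonderful" command
 * { "Wonders" Wonderful } : 10
 * </pre>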
 *
 * <a id="algorithm"></a>
 *
 * <h2>Benchmark "algorithm"</h2>
 *
 * <p>The following is an informal description of the supported syntax.
 *
 * <ol>
 *   <li><b>Measuring</b>: When a command is executed, statistics for the elapsed execution time and
 *       memory consumption are collected. At any time, those statistics can be printed, using one
 *       of the available ReportTasks.
 *   <li><b>Comments</b> start with '<span style="color: #FF0066">#</span>'.
 *   <li><b>Serial</b> sequences are enclosed within '<span style="color: #FF0066">{ }</span>'.
 *   <li><b>Parallel</b> sequences are enclosed within '<span style="color: #FF0066">[ ]</span>'.
 *   <li><b>Sequence naming:</b> To name a sequence, put '<span style="color:
 *       #FF0066">"name"</span>' just after '<span style="color: #FF0066">{</span>' or '<span
 *       style="color: #FF0066">[</span>'. <br>
 *       Example - <span style="color: #FF0066">{ "ManyAdds" AddDoc } : 1000000</span> - would name
 *       the sequence of 1M add docs "ManyAdds", and this name would later appear in statistic
 *       reports. If you don't specify a name for a sequence, it is given one: you can see it as the
 *       algorithm is printed just before benchmark execution starts.
 *   <li><b>Repeating</b>: To repeat the tasks of a sequence N times, add '<span style="color:
 *       #FF0066">: N</span>' just after the sequence closing tag - '<span style="color:
 *       #FF0066">}</span>' or '<span style="color: #FF0066">]</span>' or '<span style="color:
 *       #FF0066">&gt;</span>'. <br>
 *       Example - <span style="color: #FF0066">[ AddDoc ] : 4</span> - would do 4 addDoc in
 *       parallel, spawning 4 threads at once. <br>
 *       Example - <span style="color: #FF0066">[ AddDoc AddDoc ] : 4</span> - would do 8 addDoc in
 *       parallel, spawning 8 threads at once. <br>
 *       Example - <span style="color: #FF0066">{ AddDoc } : 30</span> - would do addDoc 30 times in
 *       a row. <br>
 *       Example - <span style="color: #FF0066">{ AddDoc AddDoc } : 30</span> - would do addDoc 60
 *       times in a row. <br>
 *       <b>Exhaustive repeating</b>: use <span style="color: #FF0066">*</span> instead of a number
 *       to repeat exhaustively. This is sometimes useful, for adding as many files as a doc maker
 *       can create, without iterating over the same file again, especially when the exact number of
 *       documents is not known in advance. For instance, TREC files extracted from a zip file.
 *       Note: when using this, you must also set <span style="color:
 *       #FF0066">content.source.forever</span> to false. <br>
 *       Example - <span style="color: #FF0066">{ AddDoc } : *</span> - would add docs until the doc
 *       maker is "exhausted".
 *   <li><b>Command parameter</b>: a command can optionally take a single parameter. If a given
 *       command does not support a parameter, or if the parameter is of the wrong type, reading the
 *       algorithm will fail with an exception and the test will not start. Currently the following
 *       tasks take optional parameters:
 *       <ul>
 *         <li><b>AddDoc</b> takes a numeric parameter, indicating the required size of added
 *             document. Note: if the DocMaker implementation used in the test does not support
 *             makeDoc(size), an exception would be thrown and the test would fail.
 *         <li><b>DeleteDoc</b> takes a numeric parameter, indicating the docid to be deleted. This
 *             is not very useful for loops, since the docid is fixed, so for deletion in loops it
 *             is better to use the <code>doc.delete.step</code> property.
 *         <li><b>SetProp</b> takes a mandatory <code>name,value</code> parameter, with ',' used as
 *             the separator.
 *         <li><b>SearchTravRetTask</b> and <b>SearchTravTask</b> take a numeric parameter,
 *             indicating the required traversal size.
 *         <li><b>SearchTravRetLoadFieldSelectorTask</b> takes a string parameter: a comma separated
 *             list of Fields to load.
 *         <li><b>SearchTravRetHighlighterTask</b> takes a string parameter: a comma separated list
 *             of parameters to define highlighting. See that task's javadocs for more information.
 *       </ul>
 *       <br>
 *       Example - <span style="color: #FF0066">AddDoc(2000)</span> - would add a document of size
 *       2000 (~bytes). <br>
 *       See conf/sample.alg for how this can be used, for instance, to check which is faster,
 *       adding many smaller documents, or few larger documents. Next candidates for supporting a
 *       parameter may be the Search tasks, for controlling the query size.
 *   <li><b>Statistic recording elimination</b>: a sequence can also end with '<span style="color:
 *       #FF0066">&gt;</span>', in which case child tasks would not store their statistics. This can
 *       be useful to avoid exploding stats data, for adding say 1M docs. <br>
 *       Example - <span style="color: #FF0066">{ "ManyAdds" AddDoc &gt; : 1000000</span> - would
 *       add a million docs, measure that total, but not save stats for each addDoc. <br>
 *       Notice that the granularity of System.currentTimeMillis() (which is used here) is system
 *       dependent, and on some systems an operation that takes 5 ms to complete may show 0 ms
 *       latency time in performance measurements. Therefore it is sometimes more accurate to look
 *       at the elapsed time of a larger sequence, as demonstrated here.
 *   <li><b>Rate</b>: To set a rate (ops/sec or ops/min) for a sequence, add '<span style="color:
 *       #FF0066">: N : R</span>' just after the sequence closing tag. This would specify
 *       repetition of N with a rate of R operations/sec. Use '<span style="color:
 *       #FF0066">R/sec</span>' or '<span style="color: #FF0066">R/min</span>' to explicitly specify
 *       that the rate is per second or per minute. The default is per second. <br>
 *       Example - <span style="color: #FF0066">[ AddDoc ] : 400 : 3</span> - would do 400 addDoc in
 *       parallel, starting up to 3 threads per second. <br>
 *       Example - <span style="color: #FF0066">{ AddDoc } : 100 : 200/min</span> - would do 100
 *       addDoc serially, waiting before starting the next add if the rate would otherwise exceed
 *       200 adds/min.
 *   <li><b>Disable Counting</b>: Each task executed contributes to the records count. This count is
 *       reflected in reports under recs/s and under recsPerRun. Most tasks count 1, some count 0,
 *       and some count more. (See <a href="#recsCounting">Results record counting clarified</a> for
 *       more details.) It is possible to disable counting for a task by preceding it with <span
 *       style="color: #FF0066">-</span>. <br>
 *       Example - <span style="color: #FF0066"> -CreateIndex </span> - would count 0 while the
 *       default behavior for CreateIndex is to count 1.
 *   <li><b>Command names</b>: Each class "AnyNameTask" in the package
 *       org.apache.lucene.benchmark.byTask.tasks, that extends PerfTask, is supported as command
 *       "AnyName" that can be used in the benchmark "algorithm" description. New commands can
 *       therefore be added just by adding such classes.
 * </ol>
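 *
 * <p>The following sketch combines several of the syntax elements above in one algorithm. It is
 * an illustration rather than one of the shipped alg files; the sequence names and all numbers
 * are arbitrary:
 *
 * <pre>
 * # build an index; CreateIndex is not counted, per-AddDoc stats are not stored
 * { "Populate"
 *     -CreateIndex
 *     { "ManyAdds" AddDoc(2000) &gt; : 10000
 *     CloseIndex
 * }
 *
 * # 4 searches in parallel, starting up to 2 threads per second
 * [ "ParSearch" Search ] : 4 : 2/sec
 *
 * RepSumByName
 * </pre>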
 *
 * <a id="tasks"></a>
 *
 * <h2>Supported tasks/commands</h2>
 *
 * <p>Existing tasks can be divided into a few groups: regular index/search work tasks, report
 * tasks, and control tasks.
 *
 * <ol>
 *   <li><b>Report tasks</b>: There are a few Report commands for generating reports. Only task runs
 *       that were completed are reported. (The 'Report tasks' themselves are not measured and not
 *       reported.)
 *       <ul>
 *         <li><span style="color: #FF0066">RepAll</span> - all (completed) task runs.
 *         <li><span style="color: #FF0066">RepSumByName</span> - all statistics, aggregated by
 *             name. So, if AddDoc was executed 2000 times, only 1 report line would be created for
 *             it, aggregating all those 2000 statistic records.
 *         <li><span style="color: #FF0066">RepSelectByPref &nbsp; prefixWord</span> - all records
 *             for tasks whose name starts with <span style="color: #FF0066">prefixWord</span>.
 *         <li><span style="color: #FF0066">RepSumByPref &nbsp; prefixWord</span> - all records for
 *             tasks whose name starts with <span style="color: #FF0066">prefixWord</span>,
 *             aggregated by their full task name.
 *         <li><span style="color: #FF0066">RepSumByNameRound</span> - all statistics, aggregated by
 *             name and by <span style="color: #FF0066">Round</span>. So, if AddDoc was executed
 *             2000 times in each of 3 <span style="color: #FF0066">rounds</span>, 3 report lines
 *             would be created for it, aggregating all those 2000 statistic records in each round.
 *             See more about rounds in the <span style="color: #FF0066">NewRound</span> command
 *             description below.
 *         <li><span style="color: #FF0066">RepSumByPrefRound &nbsp; prefixWord</span> - similar to
 *             <span style="color: #FF0066">RepSumByNameRound</span>, just that only tasks whose
 *             name starts with <span style="color: #FF0066">prefixWord</span> are included.
 *       </ul>
 *       If needed, additional reports can be added by extending the abstract class ReportTask, and
 *       by manipulating the statistics data in Points and TaskStats.
 *   <li><b>Control tasks</b>: A few tasks control the benchmark algorithm as a whole:
 *       <ul>
 *         <li><span style="color: #FF0066">ClearStats</span> - clears the entire statistics.
 *             Further reports would only include task runs that would start after this call.
 *         <li><span style="color: #FF0066">NewRound</span> - virtually start a new round of
 *             performance test. Although this command can be placed anywhere, it mostly makes sense
 *             at the end of an outermost sequence. <br>
 *             This increments a global "round counter". All task runs that would start now would
 *             record the new, updated round counter as their round number. This would appear in
 *             reports. In particular, see <span style="color: #FF0066">RepSumByNameRound</span>
 *             above. <br>
 *             An additional effect of NewRound is that numeric and boolean properties defined (at
 *             the head of the .alg file) as a sequence of values, e.g. <span style="color:
 *             #FF0066"> merge.factor=mrg:10:100:10:100</span>, would advance (cyclically) to the
 *             next value. Note: this would also be reflected in the reports, in this case under a
 *             column that would be named "mrg".
 *         <li><span style="color: #FF0066">ResetInputs</span> - DocMaker and the various
 *             QueryMakers would reset their counters to start. The way these Maker interfaces work,
 *             each call for makeDocument() or makeQuery() creates the next document or query that
 *             it "knows" to create. If that pool is "exhausted", the "maker" starts over again.
 *             The ResetInputs command therefore makes the rounds comparable, so it is useful to
 *             invoke ResetInputs together with NewRound.
 *         <li><span style="color: #FF0066">ResetSystemErase</span> - reset all index and input data
 *             and call gc. Does NOT reset statistics. This contains ResetInputs. All
 *             writers/readers are nullified, deleted, closed. Index is erased. Directory is erased.
 *             You would have to call CreateIndex once this was called...
 *         <li><span style="color: #FF0066">ResetSystemSoft</span> - reset all index and input data
 *             and call gc. Does NOT reset statistics. This contains ResetInputs. All
 *             writers/readers are nullified, closed. Index is NOT erased. Directory is NOT erased.
 *             This is useful for testing performance on an existing index, for instance if the
 *             construction of a large index took a very long time and now you would like to test
 *             its search or update performance.
 *       </ul>
 *   <li>Other existing tasks are quite straightforward and are only briefly described here.
 *       <ul>
 *         <li><span style="color: #FF0066">CreateIndex</span> and <span style="color:
 *             #FF0066">OpenIndex</span> both leave the index open for later update operations.
 *             <span style="color: #FF0066">CloseIndex</span> would close it.
 *         <li><span style="color: #FF0066">OpenReader</span>, similarly, would leave an index
 *             reader open for later search operations. But this has further semantics. If a Read
 *             operation is performed, and an open reader exists, it would be used. Otherwise, the
 *             read operation would open its own reader and close it when the read operation is
 *             done. This allows testing various scenarios - sharing a reader, searching with "cold"
 *             reader, with "warmed" reader, etc. The read operations affected by this are: <span
 *             style="color: #FF0066">Warm</span>, <span style="color: #FF0066">Search</span>, <span
 *             style="color: #FF0066">SearchTrav</span> (search and traverse), and <span
 *             style="color: #FF0066">SearchTravRet</span> (search and traverse and retrieve).
 *             Notice that each of the 3 search task types maintains its own queryMaker instance.
 *         <li><span style="color: #FF0066">CommitIndex</span> and <span style="color:
 *             #FF0066">ForceMerge</span> can be used to commit changes to the index then merge the
 *             index segments. The integer parameter specifies how many segments to merge down to
 *             (default 1).
 *         <li><span style="color: #FF0066">WriteLineDoc</span> prepares a 'line' file where each
 *             line holds a document with <i>title</i>, <i>date</i> and <i>body</i> elements,
 *             separated by [TAB]. A line file is useful if one wants to measure pure indexing
 *             performance, without the overhead of parsing the data.<br>
 *             You can use LineDocSource as a ContentSource over a 'line' file.
 *         <li><span style="color: #FF0066">ConsumeContentSource</span> consumes a ContentSource.
 *             Useful for e.g. testing a ContentSource performance, without the overhead of
 *             preparing a Document out of it.
 *       </ul>
 * </ol>
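 *
 * <p>As a rough sketch of the reader semantics described for OpenReader, the following compares
 * searching with one shared reader against searching with a reader opened per search. It assumes
 * an index already exists, and it uses the CloseReader command, which is not described above, to
 * release the shared reader; the sequence names and counts are arbitrary:
 *
 * <pre>
 * OpenReader
 * { "SrchSameRdr" Search &gt; : 500
 * CloseReader
 *
 * { "SrchNewRdr" Search &gt; : 500
 *
 * RepSumByPref Srch
 * </pre>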
 *
 * <a id="properties"></a>
 *
 * <h2>Benchmark properties</h2>
 *
 * <p>Properties are read from the header of the .alg file, and define several parameters of the
 * performance test. As mentioned above for the <span style="color: #FF0066">NewRound</span> task,
 * numeric and boolean properties that are defined as a sequence of values, e.g. <span style="color:
 * #FF0066">merge.factor=mrg:10:100:10:100</span> would increment (cyclic) to the next value, when
 * NewRound is called, and would also appear as a named column in the reports (column name would be
 * "mrg" in this example).
 *
 * <p>Some of the currently defined properties are:
 *
 * <ol>
 *   <li><span style="color: #FF0066">analyzer</span> - full class name for the analyzer to use.
 *       Same analyzer would be used in the entire test.
 *   <li><span style="color: #FF0066">directory</span> - which directory implementation to use for
 *       the performance test, e.g. FSDirectory.
 *   <li><b>Index work parameters</b>: Multi int/boolean values would be iterated with calls to
 *       NewRound. They would also be added as columns in the reports; the first string in the
 *       sequence is the column name. (Make sure it is not shorter than any value in the sequence.)
 *       <ul>
 *         <li><span style="color: #FF0066">max.buffered</span> <br>
 *             Example: max.buffered=buf:10:10:100:100 - this would define using maxBufferedDocs of
 *             10 in iterations 0 and 1, and 100 in iterations 2 and 3.
 *         <li><span style="color: #FF0066">merge.factor</span> - which merge factor to use.
 *         <li><span style="color: #FF0066">compound</span> - whether the index is using the
 *             compound format or not. Valid values are "true" and "false".
 *       </ul>
 * </ol>
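 *
 * <p>To make the iteration concrete, here is a small sketch with two multi-valued properties and
 * four rounds. The body of the "Round" sequence is only a placeholder for real work, and the
 * names and counts are arbitrary:
 *
 * <pre>
 * merge.factor=mrg:10:100:10:100
 * max.buffered=buf:10:10:100:100
 *
 * { "Round"
 *     CreateIndex
 *     { "Adds" AddDoc &gt; : 1000
 *     CloseIndex
 *     ResetSystemErase
 *     NewRound
 * } : 4
 *
 * RepSumByNameRound
 * </pre>
 *
 * <p>With these values, round 0 would use mrg=10 and buf=10, round 1 mrg=100 and buf=10, round 2
 * mrg=10 and buf=100, and round 3 mrg=100 and buf=100.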
 *
 * <p>Here is a list of currently defined properties:
 *
 * <ol>
 *   <li><b>Root directory for data and indexes:</b>
 *       <ul>
 *         <li>work.dir (default is System property "benchmark.work.dir" or "work".)
 *       </ul>
 *   <li><b>Docs and queries creation:</b>
 *       <ul>
 *         <li>analyzer
 *         <li>doc.maker
 *         <li>content.source.forever
 *         <li>html.parser
 *         <li>doc.stored
 *         <li>doc.tokenized
 *         <li>doc.term.vector
 *         <li>doc.term.vector.positions
 *         <li>doc.term.vector.offsets
 *         <li>doc.store.body.bytes
 *         <li>docs.dir
 *         <li>query.maker
 *         <li>file.query.maker.file
 *         <li>file.query.maker.default.field
 *         <li>search.num.hits
 *       </ul>
 *   <li><b>Logging</b>:
 *       <ul>
 *         <li>log.step
 *         <li>log.step.[name], where [name]Task is the task class name, e.g. log.step.DeleteDoc
 *             (or log.step.Wonderful for the WonderfulTask example above).
 *         <li>log.queries
 *         <li>task.max.depth.log
 *       </ul>
 *   <li><b>Index writing</b>:
 *       <ul>
 *         <li>compound
 *         <li>merge.factor
 *         <li>max.buffered
 *         <li>directory
 *         <li>ram.flush.mb
 *         <li>codec.postingsFormat (e.g. Direct). Note: no codec should be specified through
 *             default.codec.
 *       </ul>
 *   <li><b>Doc deletion</b>:
 *       <ul>
 *         <li>doc.delete.step
 *       </ul>
 *   <li><b>Spatial</b>: Numerous; see spatial.alg
 *   <li><b>Task alternative packages</b>:
 *       <ul>
 *         <li>alt.tasks.packages - comma separated list of additional packages where task classes
 *             will be looked for when not found in the default package (that of PerfTask). If the
 *             same task class appears in more than one package, the package indicated first in this
 *             list will be used.
 *       </ul>
 * </ol>
 *
 * <p>For sample use of these properties see the *.alg files under conf.
 *
 * <a id="example"></a>
 *
 * <h2>Example input algorithm and the result benchmark report</h2>
 *
 * <p>The following example is in conf/sample.alg:
 *
 * <pre>
 * <span style="color: #003333"># --------------------------------------------------------
 * #
 * # Sample: what is the effect of doc size on indexing time?
 * #
 * # There are two parts in this test:
 * # - PopulateShort adds 2N documents of length  L
 * # - PopulateLong  adds  N documents of length 2L
 * # Which one would be faster?
 * # The comparison is done twice.
 * #
 * # --------------------------------------------------------
 * </span>
 * <span style="color: #990066"># -------------------------------------------------------------------------------------
 * # multi val params are iterated by NewRound's, added to reports, start with column name.
 * merge.factor=mrg:10:20
 * max.buffered=buf:100:1000
 * compound=true
 *
 * analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
 * directory=FSDirectory
 *
 * doc.stored=true
 * doc.tokenized=true
 * doc.term.vector=false
 * doc.add.log.step=500
 *
 * docs.dir=reuters-out
 *
 * doc.maker=org.apache.lucene.benchmark.byTask.feeds.SimpleDocMaker
 *
 * query.maker=org.apache.lucene.benchmark.byTask.feeds.SimpleQueryMaker
 *
 * # task at this depth or less would print when they start
 * task.max.depth.log=2
 *
 * log.queries=false
 * # -------------------------------------------------------------------------------------</span>
 * <span style="color: #3300FF">{
 *
 *     { "PopulateShort"
 *         CreateIndex
 *         { AddDoc(4000) &gt; : 20000
 *         Optimize
 *         CloseIndex
 *     &gt;
 *
 *     ResetSystemErase
 *
 *     { "PopulateLong"
 *         CreateIndex
 *         { AddDoc(8000) &gt; : 10000
 *         Optimize
 *         CloseIndex
 *     &gt;
 *
 *     ResetSystemErase
 *
 *     NewRound
 *
 * } : 2
 *
 * RepSumByName
 * RepSelectByPref Populate
 * </span>
 * </pre>
 *
 * <p>The command line for running this sample: <br>
 * <code>./gradlew -p lucene/benchmark getReuters run -PtaskAlg=conf/sample.alg</code>
 *
 * <p>The output report from running this test contains the following:
 *
 * <pre>
 * Operation     round mrg  buf   runCnt   recsPerRun        rec/s  elapsedSec    avgUsedMem    avgTotalMem
 * PopulateShort     0  10  100        1        20003        119.6      167.26    12,959,120     14,241,792
 * PopulateLong -  - 0  10  100 -  -   1 -  -   10003 -  -  - 74.3 -  - 134.57 -  17,085,208 -   20,635,648
 * PopulateShort     1  20 1000        1        20003        143.5      139.39    63,982,040     94,756,864
 * PopulateLong -  - 1  20 1000 -  -   1 -  -   10003 -  -  - 77.0 -  - 129.92 -  87,309,608 -  100,831,232
 * </pre>
 *
 * <a id="recsCounting"></a>
 *
 * <h2>Results record counting clarified</h2>
 *
 * <p>Two columns in the results table indicate record counts: records-per-run and
 * records-per-second. What do they mean?
 *
 * <p>Almost every task gets 1 in this count just for being executed. Task sequences aggregate the
 * counts of their child tasks, plus their own count of 1. So, a task sequence containing 5 other
 * task sequences, each running a single other task 10 times, would have a count of 1 + 5 * (1 + 10)
 * = 56.
 *
 * <p>The traverse and retrieve tasks "count" more: a traverse task would add 1 for each traversed
 * result (hit), and a retrieve task would additionally add 1 for each retrieved doc. So, regular
 * Search would count 1, SearchTrav that traverses 10 hits would count 11, and a SearchTravRet task
 * that retrieves (and traverses) 10, would count 21.
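 *
 * <p>As a sketch of these rules, assuming every query matches at least 10 documents, the three
 * search tasks would contribute the following counts:
 *
 * <pre>
 * # counts 1
 * Search
 * # counts 1 + 10 = 11
 * SearchTrav(10)
 * # counts 1 + 10 + 10 = 21
 * SearchTravRet(10)
 * </pre>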
 *
 * <p>Confusing? This might help: always examine the <code>elapsedSec</code> column, and always
 * compare "apples to apples", i.e. it is interesting to check how the <code>rec/s</code> changed
 * for the same task (or sequence) between two different runs, but it is not very useful to know how
 * the <code>rec/s</code> differs between <code>Search</code> and <code>SearchTrav</code> tasks. For
 * the latter, <code>elapsedSec</code> would bring more insight.
 */
package org.apache.lucene.benchmark.byTask;
